 mean absolute percentage error


ChatBCG: Can AI Read Your Slide Deck?

Singh, Nikita, Balian, Rob, Martinelli, Lukas

arXiv.org Artificial Intelligence

With the advanced vision capabilities of GPT-4o and Gemini Flash, an important question arises regarding the accuracy of these functionalities in practical business applications. Our assumption was that multimodal models are good at reading and summarizing charts: when given an image of a slide deck, they do a good job of summarizing its key insights, often including relevant data points. Existing research into this question has evaluated the efficacy of LLMs when parsing tables [3], concluding that performance is highly sensitive to the input prompts. Other work evaluates LLMs' ability to read and reason about mathematical graphs [2], finding that GPT models outperform alternatives. This paper aims to explore whether multimodal models perform well on a variant of this skill: answering straightforward questions that require the model to pick out a number from a slide deck.
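
As a hedged illustration of the probe this paper describes -- asking a multimodal model to read a single number off a slide image -- the following Python sketch uses the OpenAI chat API. The model name, prompt, and image URL are placeholder assumptions, not the authors' actual evaluation harness.

    # Minimal sketch: ask a multimodal model to pick one number off a slide.
    # The prompt and image URL are illustrative assumptions.
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "What was revenue in FY2023? Answer with the number only."},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/slide_12.png"}},
            ],
        }],
    )
    print(response.choices[0].message.content)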


Forecasting Lithium-Ion Battery Longevity with Limited Data Availability: Benchmarking Different Machine Learning Algorithms

Hilal, Hudson, Saha, Pramit

arXiv.org Artificial Intelligence

As the use of lithium-ion batteries continues to grow, it becomes increasingly important to predict their remaining useful life. This work compares the relative performance of different machine learning algorithms, both traditional machine learning and deep learning, to determine the best-performing algorithms for battery cycle-life prediction from minimal data. We investigated 14 machine learning models that were fed handcrafted features based on statistical data, split into 3 feature groups for testing. For deep learning, we tested a variety of neural network models, including different configurations of standard Recurrent Neural Networks, Gated Recurrent Units, and Long Short-Term Memory networks with and without an attention mechanism. Deep learning models were fed multivariate time series signals based on the raw data for each battery across the first 100 cycles. Our experiments revealed that the machine learning algorithms on handcrafted features performed particularly well, resulting in a 10-20% average mean absolute percentage error. The best-performing algorithm was the Random Forest Regressor, which achieved a minimum mean absolute percentage error of 9.8%. Traditional machine learning models excelled due to their capability to capture general dataset trends. In comparison, deep learning models performed particularly poorly on raw, limited data: algorithms like GRUs and RNNs that focus on capturing medium-range dependencies were less adept at recognizing the gradual, slow trends critical for this task. Our investigation reveals that machine learning models with handcrafted features are more effective than advanced deep learning models for predicting remaining useful lithium-ion battery life when data availability is limited.
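
A minimal sketch of the best-performing setup the abstract reports -- a Random Forest on handcrafted statistical features, scored with mean absolute percentage error -- might look as follows. The synthetic features and targets are placeholders for the per-battery statistics derived from the first 100 cycles, not the authors' pipeline.

    # Sketch: Random Forest on handcrafted features, evaluated with MAPE.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.metrics import mean_absolute_percentage_error
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    # Stand-in for per-battery features summarizing the early cycles
    # (e.g., mean/variance of discharge capacity); real features would
    # come from the raw cycling data.
    X = rng.normal(size=(120, 6))
    y = 500 + 300 * X[:, 0] ** 2 + rng.normal(scale=30, size=120)  # cycle life

    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
    model = RandomForestRegressor(n_estimators=200, random_state=0)
    model.fit(X_train, y_train)

    # sklearn returns MAPE as a fraction; multiply by 100 for the
    # percentage figures quoted in the abstract.
    mape = mean_absolute_percentage_error(y_test, model.predict(X_test)) * 100
    print(f"MAPE: {mape:.1f}%")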


ALPHA: AnomaLous Physiological Health Assessment Using Large Language Models

Tang, Jiankai, Wang, Kegang, Hu, Hongming, Zhang, Xiyuxing, Wang, Peiyu, Liu, Xin, Wang, Yuntao

arXiv.org Artificial Intelligence

This study concentrates on evaluating the efficacy of Large Language Models (LLMs) in healthcare, with a specific focus on their application in personal anomalous health monitoring. Our research primarily investigates the capabilities of LLMs in interpreting and analyzing physiological data obtained from FDA-approved devices. We conducted an extensive analysis using anomalous physiological data gathered in a simulated low-air-pressure plateau environment. This allowed us to assess the precision and reliability of LLMs in understanding and evaluating users' health status with notable specificity. Our findings reveal that LLMs exhibit exceptional performance in determining medical indicators, including a Mean Absolute Error (MAE) of less than 1 beat per minute for heart rate and less than 1% for oxygen saturation (SpO2). Furthermore, the Mean Absolute Percentage Error (MAPE) for these evaluations remained below 1%, with the overall accuracy of health assessments surpassing 85%. In image analysis tasks, such as interpreting photoplethysmography (PPG) data, our specially adapted GPT models demonstrated remarkable proficiency, achieving less than 1 bpm error in cycle count and 7.28 MAE for heart rate estimation. This study highlights LLMs' dual role as health data analysis tools and pivotal elements in advanced AI health assistants, offering personalized health insights and recommendations within the future health assistant framework.
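
As a hedged sketch of the underlying PPG task -- counting pulse cycles and converting the count to beats per minute -- the snippet below uses simple peak detection on a synthetic waveform. The paper's adapted GPT models work from PPG images instead; the sampling rate and peak-picking settings here are illustrative assumptions.

    # Sketch: estimate heart rate by counting pulse cycles in a PPG trace.
    import numpy as np
    from scipy.signal import find_peaks

    fs = 100.0                      # sampling rate in Hz (assumed)
    t = np.arange(0, 10, 1 / fs)    # 10 seconds of signal
    ppg = np.sin(2 * np.pi * 1.2 * t) \
        + 0.05 * np.random.default_rng(0).normal(size=t.size)

    # Each pulse cycle shows up as one dominant peak; enforce a minimum
    # peak-to-peak distance so dicrotic notches are not double-counted.
    peaks, _ = find_peaks(ppg, distance=fs * 0.4)
    heart_rate_bpm = len(peaks) / (t[-1] - t[0]) * 60
    print(f"cycles: {len(peaks)}, estimated HR: {heart_rate_bpm:.0f} bpm")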


Pricing European Options with Google AutoML, TensorFlow, and XGBoost

Berger, Juan Esteban

arXiv.org Artificial Intelligence

Researchers have been using neural networks and other related machine-learning techniques to price options since the early 1990s. After three decades of improvements in machine learning techniques, computational processing power, cloud computing, and data availability, this paper provides a comparison of Google Cloud's AutoML Regressor, TensorFlow neural networks, and XGBoost gradient-boosted decision trees for pricing European options. All three types of models were able to outperform the Black-Scholes model in terms of mean absolute error. These results showcase the potential of using historical data from an option's underlying asset to price European options, especially with machine learning algorithms that learn complex patterns that traditional parametric models do not take into account.
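
For reference, the parametric baseline all three learned models are compared against is the Black-Scholes formula for a European call; the sketch below implements it with illustrative inputs.

    # Black-Scholes price of a European call option.
    import numpy as np
    from scipy.stats import norm

    def black_scholes_call(S, K, T, r, sigma):
        """S: spot, K: strike, T: years to expiry, r: rate, sigma: vol."""
        d1 = (np.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * np.sqrt(T))
        d2 = d1 - sigma * np.sqrt(T)
        return S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)

    print(black_scholes_call(S=100, K=105, T=0.5, r=0.02, sigma=0.25))

The abstract's claim is that all three learned models, trained on historical data for the underlying asset, achieve lower mean absolute error than this closed-form price.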


Modeling and Forecasting COVID-19 Cases using Latent Subpopulations

Vega, Roberto, Shah, Zehra, Ramazi, Pouria, Greiner, Russell

arXiv.org Artificial Intelligence

Classical epidemiological models assume homogeneous populations. There have been important extensions that model heterogeneous populations when the identity of the sub-populations is known, such as age group or geographical location. Here, we propose two new methods to model the number of people infected with COVID-19 over time, each as a linear combination of latent sub-populations -- i.e., when we do not know which person is in which sub-population, and the only available observations are the aggregates across all sub-populations. Method #1 is a dictionary-based approach, which begins with a large number of pre-defined sub-population models (each with its own starting time, shape, etc.), then determines the (positive) weights of a small (learned) number of sub-populations. Method #2 is a mixture of M fittable curves, where M, the number of sub-populations to use, is given by the user. Both methods are compatible with any parametric model; here we demonstrate their use with first (a) Gaussian curves and then (b) SIR trajectories. We empirically show the performance of the proposed methods, first in (i) modeling the observed data and then in (ii) forecasting the number of infected people 1 to 4 weeks in advance. Across 187 countries, we show that the dictionary approach had the lowest mean absolute percentage error, and also the lowest variance, when compared with classical SIR models; moreover, it was a strong baseline that outperformed many of the models developed for COVID-19 forecasting.
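
A minimal sketch of Method #2 with Gaussian components -- fitting a sum of M bell curves to an aggregate case series -- could use non-linear least squares, as below. The synthetic series and the choice M = 2 are illustrative assumptions, not the paper's experimental setup.

    # Sketch: fit a mixture of M = 2 Gaussian curves to aggregate counts.
    import numpy as np
    from scipy.optimize import curve_fit

    def two_gaussians(t, a1, mu1, s1, a2, mu2, s2):
        g = lambda a, mu, s: a * np.exp(-0.5 * ((t - mu) / s) ** 2)
        return g(a1, mu1, s1) + g(a2, mu2, s2)

    t = np.arange(120, dtype=float)
    rng = np.random.default_rng(1)
    cases = two_gaussians(t, 900, 35, 8, 600, 80, 12) \
        + rng.normal(scale=20, size=t.size)

    # Lower bounds of zero enforce positive weights and widths, echoing
    # the paper's constraint that sub-population weights be positive.
    p0 = [800, 30, 10, 500, 85, 10]
    params, _ = curve_fit(two_gaussians, t, cases, p0=p0,
                          bounds=(0, [np.inf] * 6))
    print(np.round(params, 1))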


Machine learning forecasting: Why, what & how

#artificialintelligence

With customer expectations and preferences changing faster than ever, a deep understanding of customer demand is essential to making the right decisions about marketing spend, sourcing, inventory, production, transportation, staffing, and more. Critical business measures like turnover, capital expenditure, risk evaluation, profit margins, cash flow, and capacity planning all rely on accurate demand forecasting, which ultimately helps businesses estimate total sales and revenue for a defined future period. Typically, demand forecasting feeds activities such as supply planning, product manufacturing planning (e.g., sourcing, R&D), and financial planning. The critical aspect of these planning activities is understanding product demand from customers and how to fulfil that demand in the most timely and efficient way. By capturing the variability of future demand through forecasting, businesses can predict customer behavior more accurately and meet demand with a higher level of confidence and significantly reduced lead times from order to delivery.


Using Time Series Analysis to Forecast Close Approaches to the Earth by Near-Earth Objects

#artificialintelligence

If we are ever struck by an impact event resulting in human extinction, it would most likely occur in the spring or fall. If you were to ask 100 people what they believed the greatest risk to human civilization is, I would bet the top 3 answers would be nuclear war, global pandemic, and global warming/climate change. However, less than 10 years ago a meteor with a diameter of approximately 20 meters and a mass of 10,000 tons exploded about 30 km above the city of Chelyabinsk in Russia. Although there were no fatalities, the blast was estimated to have caused $30 million worth of damage and injured 1,500 people. About 100 years earlier, in 1908, a meteor 50–60 meters in size exploded over Siberia with the power of a 12-megaton explosion, destroying about 2,200 square kilometers of forest.


Midwifery Learning and Forecasting: Predicting Content Demand with User-Generated Logs

Guitart, Anna, del Río, Ana Fernández, Periáñez, África, Bellhouse, Lauren

arXiv.org Machine Learning

Every day, 800 women and 6,700 newborns die from complications related to pregnancy or childbirth. A well-trained midwife can prevent most of these maternal and newborn deaths. Data science models, together with the logs generated by users of online learning applications for midwives, can help improve midwives' learning competencies. The goal is to use these rich behavioral data to push digital learning towards personalized content and to provide an adaptive learning journey. In this work, we evaluate various forecasting methods to determine the interest of future users in the different kinds of content available in the app, broken down by profession and region.
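
As a hedged sketch of one forecasting method that could rank future content demand, the snippet below applies Holt-Winters exponential smoothing to weekly view counts for a single content category. The synthetic series is an assumption; the paper evaluates several methods, broken down by profession and region.

    # Sketch: forecast weekly content demand with Holt-Winters smoothing.
    import numpy as np
    import pandas as pd
    from statsmodels.tsa.holtwinters import ExponentialSmoothing

    rng = np.random.default_rng(2)
    weeks = pd.date_range("2021-01-03", periods=156, freq="W")
    # Synthetic stand-in for weekly views: trend + yearly seasonality.
    views = 200 + 1.5 * np.arange(156) \
        + 30 * np.sin(2 * np.pi * np.arange(156) / 52) \
        + rng.normal(scale=10, size=156)
    series = pd.Series(views, index=weeks)

    model = ExponentialSmoothing(series, trend="add", seasonal="add",
                                 seasonal_periods=52).fit()
    print(model.forecast(8))   # expected demand over the next 8 weeks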


Using Time Series Analysis to predict NIFTY50 movements

#artificialintelligence

The NIFTY 50 index is the National Stock Exchange of India's benchmark broad-based stock market index for the Indian equity market. The full form of NIFTY is National Stock Exchange Fifty. It represents the weighted average of 50 Indian company stocks across 12 sectors and is one of the two main stock indices used in India, the other being the BSE Sensex. In this blog, we will see how we can use various time series algorithms to predict how the NIFTY50 index will move over the next 30 days. To download the data, we can go to the Yahoo Finance site and download the historical data for the NIFTY50 index.
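
A minimal sketch of that workflow -- pulling NIFTY50 history from Yahoo Finance and projecting 30 days ahead with a simple ARIMA model -- is shown below. The ^NSEI ticker and the ARIMA order are assumptions, and a real analysis would check stationarity and compare several models, as the blog sets out to do.

    # Sketch: download NIFTY50 closes and forecast 30 days with ARIMA.
    import yfinance as yf
    from statsmodels.tsa.arima.model import ARIMA

    # ^NSEI is assumed to be the Yahoo Finance ticker for NIFTY 50;
    # .squeeze() collapses the result to a single price series.
    close = yf.download("^NSEI", start="2020-01-01",
                        end="2021-01-01")["Close"].squeeze().dropna()

    model = ARIMA(close, order=(1, 1, 1)).fit()
    forecast = model.forecast(steps=30)   # next 30 trading days
    print(forecast.tail())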


Automated Estimation of Total Lung Volume using Chest Radiographs and Deep Learning

Sogancioglu, Ecem, Murphy, Keelin, Scholten, Ernst Th., Boulogne, Luuk H., Prokop, Mathias, van Ginneken, Bram

arXiv.org Artificial Intelligence

Total lung volume is an important quantitative biomarker and is used for the assessment of restrictive lung diseases. In this study, we investigate the performance of several deep-learning approaches for automated measurement of total lung volume from chest radiographs. 7621 posteroanterior and lateral view chest radiographs (CXR) were collected from patients with chest CT available. Similarly, 928 CXR studies were chosen from patients with pulmonary function test (PFT) results. The reference total lung volume was calculated from lung segmentation on CT or from PFT data, respectively. This dataset was used to train deep-learning architectures to predict total lung volume from chest radiographs. The experiments were constructed in a step-wise fashion with increasing complexity to demonstrate the effect of training with CT-derived labels only and the sources of error. The optimal models were tested on 291 CXR studies with reference lung volume obtained from PFT. The optimal deep-learning regression model achieved an MAE of 408 ml, a MAPE of 8.1%, and a Pearson's r of 0.92 using both frontal and lateral chest radiographs as input. CT-derived labels were useful for pre-training, but the optimal performance was obtained by fine-tuning the network with PFT-derived labels. We demonstrate, for the first time, that state-of-the-art deep learning solutions can accurately measure total lung volume from plain chest radiographs. The proposed model can be used to obtain total lung volume from routinely acquired chest radiographs at no additional cost and could be a useful tool to identify trends over time in patients referred regularly for chest X-rays.
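
As a hedged sketch of the general approach -- a convolutional network regressing a single scalar volume from a radiograph -- a minimal Keras model might look as follows. The architecture, input size, and single-view input are placeholder assumptions; the paper compares several deep-learning designs and uses both frontal and lateral views.

    # Sketch: CNN that regresses total lung volume (ml) from one CXR view.
    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(256, 256, 1)),   # one grayscale CXR view
        tf.keras.layers.Conv2D(16, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(1),                     # predicted volume in ml
    ])
    model.compile(optimizer="adam", loss="mae")       # MAE matches the reported metric
    model.summary()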